APPARATUS AND METHOD FOR COMPRESSING BINARIZED IMAGES



FIELD OF THE INVENTION 



The present invention relates to methods for 
compressing binarized images, generally. 



BACKGROUND OF THE INVENTION 



Arithmetic coding is described in:

Witten, I. H. et al., "Arithmetic coding for data
compression", Computing Practices, Communications of the
ACM, June 1987, Vol. 30(6); and

"Arithmetic coding and statistical modeling",
Dr. Dobb's Journal, Feb. 1991, pp. 16-29.

The MR decoding scheme is described in CCITT
Recommendations T.4 and T.6 for Groups 3 and 4.

A conventional binarizing technique is described
in Foley, J. et al., Computer Graphics: Principles
and Practice, 2nd Ed., Section 13.1.2, pages 568 -
573.

The disclosures of all of the above publications
are hereby incorporated by reference.

SUMMARY OF THE INVENTION



The present invention seeks to provide an 
improved image manipulation system. 

There is thus provided in accordance with a 
preferred embodiment of the present invention a method 
for compressing binarized images including receiving a 
binarized image and generating a first sequence of first 
code symbols representing the binarized image wherein at 
least one row of the image is represented in run-length 
encoded format, and encoding a portion of the first 
sequence of code symbols using a preliminary encoding 
scheme, thereby to provide a first portion of a second 
sequence of code symbols, and, while encoding, accumu- 
lating the frequency of at least some of the first code 
symbols thus far encoded and generating an additional 
portion of the second sequence using a modified version 
of the code scheme such that at least one subsequent 
code symbol in the first sequence with a large accumulat- 
ed frequency is encoded more compactly in the second 
portion than at least one subsequent code symbol in the 
first sequence with a small accumulated frequency. 

Further in accordance with a preferred embodiment
of the present invention, a modified Huffman coding
scheme is employed to generate the first sequence of
first code symbols.

In accordance with another preferred embodiment 
of the present invention, there is provided a method for 
compressing binarized images including receiving a binar- 
ized image and generating a first sequence of first code 
symbols representing the binarized image including a 
representation of one row of the binarized image and a 
representation of differences between at least one subsequent
row and at least one previous row, and encoding a
portion of the first sequence of code symbols using a
preliminary encoding scheme, thereby to provide a first 
portion of a second sequence of code symbols, and, while 
encoding, accumulating the frequency of at least some of 
the first code symbols thus far encoded and generating an 
additional portion of the second sequence using a modi- 
fied version of the code scheme such that at least one 
subsequent code symbol in the first sequence with a large 
accumulated frequency is encoded more compactly in the 
second portion than at least one subsequent code symbol 
in the first sequence with a small accumulated frequency. 

Further in accordance with a preferred embodi- 
ment of the present invention, the encoding scheme used 
to encode the first sequence of code symbols is continu- 
ally modified such that code symbols in the first se- 
quence with a large accumulated frequency are encoded 
more compactly in the second portion than subsequent code 
symbols in the first sequence with a small accumulated 
frequency. 

Still further in accordance with a preferred
embodiment of the present invention, a modified-read
coding scheme is employed to generate the first sequence
of first code symbols.

Further in accordance with a preferred embodiment
of the present invention, a modified modified-read
coding scheme is employed to generate the first sequence
of first code symbols.

Still further in accordance with a preferred 
embodiment of the present invention, the method also 
includes binarizing a discrete level image, thereby to 
provide the binarized image. 

Additionally in accordance with a preferred
embodiment of the present invention, the method also 
includes binarizing a continuous level image, thereby to 
provide the binarized image. 

Still further in accordance with a preferred
embodiment of the present invention, arithmetic coding is
employed to translate the accumulated frequency of at
least some of the first code symbols into second code
symbols.

There is also provided, in accordance with a 
preferred embodiment of the present invention, apparatus 
for compressing binarized images including a run-length 
encoder operative to receive a binarized image and to 
generate a first sequence of first code symbols repre- 
senting the binarized image wherein at least one row of 
the image is represented in run-length encoded format, 
and an adaptive encoder operative to encode a portion of
the first sequence of code symbols using a preliminary 
encoding scheme, thereby to provide a first portion of a 
second sequence of code symbols, and, while encoding, to 
accumulate the frequency of at least some of the first 
code symbols thus far encoded and to generate an addi- 
tional portion of the second sequence using a modified 
version of the code scheme such that at least one subse- 
quent code symbol in the first sequence with a large 
accumulated frequency is encoded more compactly in the 
second portion than at least one subsequent code symbol 
in the first sequence with a small accumulated frequency. 

There is further provided, in accordance with a 
preferred embodiment of the present invention, apparatus 
for compressing binarized images including a binarized 
image compressor operative to receive a binarized image 
and to generate a first sequence of first code symbols 
representing the binarized image, the first sequence 
including a representation of one row of the binarized 
image and a representation of differences between at 
least one subsequent row and at least one previous row, 
and an adaptive encoder operative to encode a portion of 
the first sequence of code symbols using a preliminary 
encoding scheme, thereby to provide a first portion of a 
second sequence of code symbols, and, while encoding, to
accumulate the frequency of at least some of the first
code symbols thus far encoded and to generate an additional
portion of the second sequence using a modified
version of the code scheme such that at least one subse- 
quent code symbol in the first sequence with a large 
accumulated frequency is encoded more compactly in the 
second portion than at least one subsequent code symbol 
in the first sequence with a small accumulated frequency. 

Further in accordance with a preferred embodiment
of the present invention, the binarized image compressor
employs a modified-read coding scheme to generate
the first sequence of first code symbols.

Still further in accordance with a preferred
embodiment of the present invention, the binarized image
compressor employs a modified modified-read coding scheme
to generate the first sequence of first code symbols.

Additionally in accordance with a preferred 
embodiment of the present invention, the adaptive encoder 
employs arithmetic coding to translate the accumulated 
frequency of at least some of the first code symbols into 
second code symbols. 

Still further in accordance with a preferred 
embodiment of the present invention, the encoding scheme
used to encode the first sequence of code symbols is 
continually modified such that code symbols in the first 
sequence with a large accumulated frequency are encoded 
more compactly in the second portion than subsequent code 
symbols in the first sequence with a small accumulated 
frequency.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention will be understood and 
appreciated from the following detailed description, 
taken in conjunction with the drawings in which: 

Fig. 1 is a simplified block diagram of an
image manipulation system constructed and operative in
accordance with a preferred embodiment of the present
invention, and

Fig. 2 is a simplified flowchart illustrating a 
preferred mode of operation in which the MR code element 
frequency accumulation unit of Fig. 1 processes a single 
MR code element in a sequence. 

Attached herewith are the following appendices
which aid in the understanding and appreciation of
one preferred embodiment of the invention shown and
described herein:

Appendix A is a computer listing of a preferred
software embodiment of the MR coding, arithmetic coding
and MR code element frequency accumulation units of
Fig. 1, and

Appendix B is a computer listing of a preferred 
software embodiment of the arithmetic decoding, MR code 
frequency accumulation and MR decoding units of Fig. 1.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

Reference is now made to Fig. 1 which is a
simplified block diagram of an image manipulation system 
constructed and operative in accordance with a preferred 
embodiment of the present invention. 

As shown, a digital representation of an image 
is provided from any suitable source, such as a scanner 
10 which scans a substrate such as a continuous level 
photograph 20, a digital camera 30, a fax machine 40, an 
image creation workstation 50 such as a Macintosh
equipped with the Adobe Photoshop software package, or a
storage medium such as a hard disk 60. The digital
representation of the image may be either a continuous level
image or a discrete level image such as a document or 
other black and white image. 

If the digital representation of the image is 
not binary, the digital representation is binarized, as 
indicated in Fig. 1 by image binarization block 70, using 
any conventional binarizing technique such as those 
described in Foley, J. et al., Computer Graphics: Principles
and Practice, 2nd Ed., Section 13.1.2, pages 568 -
573.
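
By way of illustration only, one simple binarizing technique is fixed thresholding of an 8-bit gray-level image. The sketch below is an assumption made for clarity and is not necessarily the technique of the cited reference. The function name `binarize_threshold` and its threshold parameter are hypothetical; the 0 = white, 1 = black convention follows the WHITE and BLACK definitions of Appendix A.

```c
#include <stddef.h>

/* Hypothetical helper: map 8-bit gray pixels to 0 (white) or 1 (black).
   A pixel darker than the threshold is treated as black. */
void binarize_threshold(const unsigned char *gray, unsigned char *bits,
                        size_t n, unsigned char threshold)
{
    for (size_t i = 0; i < n; i++)
        bits[i] = (gray[i] < threshold) ? 1 : 0;
}
```

A continuous level photograph would normally call for a halftoning technique rather than a single threshold; the threshold is shown only because it is the shortest instance of binarization.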

The binarized image is then coded by MR coding 
unit 80, using the MR coding scheme described in CCITT 
Recommendation T.4 and T.6 for Groups 3 or 4.

The MR coded binarized image generated by MR
coding unit 80 then undergoes arithmetic coding in arithmetic
coding unit 90. The arithmetic coding unit 90
receives as input:

a. the sequence of MR code elements which forms
the MR coded binarized image and

b. the estimated probability of each MR code
element, which is provided by an MR code element frequency
accumulation unit 100. Initially, the estimated probabilities
of all MR code elements are typically taken to
be equal. However, as the MR code element sequence flows
into the MR code element frequency accumulation unit 100,
the estimated probabilities change based on the number of
times each MR code element is encountered.

The sequence of MR code elements typically
includes code elements of 3 types:

a. MR control type code elements;

b. Black run length type code elements; and

c. White run length type code elements.

The frequency accumulation unit 100 typically 
receives as input each MR code element and, associated 
therewith, an indication of the type of that MR code 
element. Typically, unit 100 computes the relative code 
element frequency for each code element within its own
code element type.
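
As a minimal sketch of this per-type bookkeeping, the estimate can be kept as one count array per code element type. All names below (`freq_model`, `model_init`, `model_count`, `model_probability`) are hypothetical, and the alphabet sizes (92 run-length elements per color, 9 MR control elements) are taken from the description of Fig. 2. This is an illustration under those assumptions, not the listing of Appendix A.

```c
/* Hypothetical type tags, chosen to match WHITE/BLACK/MR_CONTROL in spirit. */
enum { TYPE_WHITE = 0, TYPE_BLACK = 1, TYPE_MR_CONTROL = 2, NUM_TYPES = 3 };

#define MAX_ELEMENTS 92   /* largest alphabet among the three types */

typedef struct {
    int      n_elements[NUM_TYPES];            /* alphabet size per type   */
    unsigned count[NUM_TYPES][MAX_ELEMENTS];   /* occurrences per element  */
    unsigned total[NUM_TYPES];                 /* sum of counts per type   */
} freq_model;

/* Give every element a count of 1 so that all estimates start out equal. */
void model_init(freq_model *m)
{
    static const int sizes[NUM_TYPES] = { 92, 92, 9 };
    for (int t = 0; t < NUM_TYPES; t++) {
        m->n_elements[t] = sizes[t];
        m->total[t] = 0;
        for (int e = 0; e < MAX_ELEMENTS; e++)
            m->count[t][e] = 0;
        for (int e = 0; e < sizes[t]; e++) {
            m->count[t][e] = 1;
            m->total[t]++;
        }
    }
}

/* Record one occurrence of an element within its own type. */
void model_count(freq_model *m, int type, int element)
{
    m->count[type][element]++;
    m->total[type]++;
}

/* Relative frequency of an element, measured only against its own type. */
double model_probability(const freq_model *m, int type, int element)
{
    return (double)m->count[type][element] / (double)m->total[type];
}
```

Keeping the totals separate means that a burst of white runs does not dilute the estimates for MR control elements.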

The arithmetic coding unit 90 may, if desired,
be replaced by an entropy encoder or an adaptive Huffman
encoder. If this is the case, then the arithmetic decoding
unit 110, described below, is replaced by an entropy
decoder or adaptive Huffman decoder, respectively.

One software embodiment of arithmetic coding 
unit 90 is described in "Arithmetic coding and statisti- 
cal modeling", Dr. Dobb's Journal, Feb. 1991, pp. 16 - 
29. The above reference also provides a software embodi- 
ment of arithmetic decoding unit 110. 

An alternative implementation of MR code element
frequency accumulation unit 100 is described below
with reference to Fig. 2.

The output of the arithmetic coding unit 90 is
a very compact representation of the original image which
is suitable, for example, for compact storage on any
suitable optical or magnetic medium and/or for rapid
facsimile transmission, 105, on conventional equipment
which preferably has an error correction capability, such
as the V.32bis modem.

The compact representation of the original
image is decompressed after being transmitted or after
being retrieved from archival. To decompress the compact
representation, the compressed data stream is fed to an
arithmetic decoding unit 110 which replaces each arithmetically
coded element with a corresponding MR code
element according to the frequency of the arithmetically
coded element. The frequency information is provided by
an MR code element frequency accumulation unit 120 which
is typically identical to unit 100. Initially, the estimated
probabilities of all MR code elements are typically
taken to be equal. However, as the MR code element sequence
flows into the MR code element frequency accumulation
unit 120, the estimated probabilities change based
on the number of times each MR code element is encountered.

The output of the arithmetic decoding unit 110 
is a sequence of MR code elements which is decoded by an 
MR decoding unit 130 using the MR decoding scheme described
in CCITT Recommendation T.4 and T.6 for Groups 3
or 4.

The output of MR decoding unit 130 is a decompressed
binarized image which is substantially identical
to the binarized image generated by image binarization
unit 70.

Fig. 2 is a simplified flowchart illustrating
a preferred mode of operation in which either of
the MR code element frequency accumulation units 100 or
120 of Fig. 1 processes a single MR code element in a
sequence of MR code elements.

If (process 210) there is a decision to reset, 
i.e. to begin accumulating frequencies from zero, then 
the method advances to stage 220. Otherwise, the method 
advances to stage 240. A reset is performed, for example, 
if a new image is to be processed whose characteristics 
are thought to differ significantly from the previous
image processed.

In process 220, a table is allocated for each
of the three MR code element types. The number of cells
in each table typically exceeds the number of code elements
of each type, by 1. The difference between the
content of the i'th cell in the table and the (i+1)th
cell in the table, also termed herein "the i'th
interval", is indicative of the relative frequency of
the i'th code element, within its code element type.

Since there are 92 code elements of the White
Run Length type and of the Black Run Length type, the 
tables for these two types each typically have 93 cells. 
Since there are 9 code elements of the MR Control type, 
the table for the MR Control type typically has 10 cells. 
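
The run-length alphabet can be pictured by looking at how a run of arbitrary length is mapped onto code element indices. The sketch below mirrors the splitting performed by `put_rl` in Appendix A, where a run longer than 63 pixels is emitted as one make-up index followed by one terminating index. The function name `split_run` and the output array are hypothetical, and the further index mapping applied by `put_code` in the appendix is deliberately left out.

```c
/* Split a run into at most one make-up index and one terminating index.
   Returns the number of indices written to out (1 or 2). */
int split_run(int len, int out[2])
{
    int n = 0;
    if (len > 63) {
        out[n++] = (len / 64) + 63;   /* make-up: 64 -> 64, 128 -> 65, ... */
        len -= (len / 64) * 64;       /* remainder, 0..63 */
    }
    out[n++] = len;                   /* terminating index, 0..63 */
    return n;
}
```

For a 1728-pixel line the largest make-up index produced this way is 90, so the indices generated by the sketch span 0 to 90.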

PROCESS 230: The table contents are initialized
by generating equal intervals such as, typically, intervals
having a length of 1.
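
Processes 220 and 230 can be sketched as follows, assuming zero-based cell indices and hypothetical names (`interval_table`, `table_reset`, `table_interval`). Each table holds one cell more than it has code elements, and cell i is initialized to i so that every interval has length 1.

```c
#define MR_CONTROL_ELEMENTS 9
#define RUN_LENGTH_ELEMENTS 92

/* cells[i + 1] - cells[i] is the i'th interval; n_elements + 1 cells are used. */
typedef struct {
    int            n_elements;
    unsigned short cells[RUN_LENGTH_ELEMENTS + 1];
} interval_table;

/* Reset a table so that all intervals are equal and of length 1. */
void table_reset(interval_table *t, int n_elements)
{
    t->n_elements = n_elements;
    for (int i = 0; i <= n_elements; i++)
        t->cells[i] = (unsigned short)i;
}

/* Length of the i'th interval, i.e. the weight of the i'th code element. */
unsigned table_interval(const interval_table *t, int i)
{
    return (unsigned)(t->cells[i + 1] - t->cells[i]);
}
```

Storing cumulative values rather than raw counts lets the coder read both ends of an element's interval with two array accesses.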

PROCESS 240: Input is received: A single MR 
code element from the MR code element sequence represent- 
ing the image, and, associated therewith, its MR code 
element type, is received as input. 

PROCESS 250: Unit 100 allows arithmetic coder
90 to arithmetically code the current MR code element, by
supplying the frequency intervals stored in the table 
corresponding to the current MR code element to the 
arithmetic coder 90. For example, if the MR code element 
is of the MR_control type, the intervals stored in the 
MR_control table are employed. 

Unit 120 allows the decoder 110 to arithmeti- 
cally decode the current MR code element, by supplying 
the same information to decoder 110. 
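
A minimal sketch of what is handed to the coder in process 250 follows. The structure `coder_symbol` is modeled on the `low_count`, `high_count` and `scale` fields used in Appendix A, while `lookup_interval` and `narrow_range` are hypothetical names. The real-valued `narrow_range` is an idealization added only to show how the interval is used; practical arithmetic coders work with integer ranges and renormalization.

```c
typedef struct {
    unsigned short low_count;    /* start of the element's interval */
    unsigned short high_count;   /* end of the element's interval   */
    unsigned short scale;        /* total of all intervals          */
} coder_symbol;

/* Read the interval of one code element from its cumulative table. */
void lookup_interval(const unsigned short *cells, int n_elements,
                     int element, coder_symbol *s)
{
    s->low_count  = cells[element];
    s->high_count = cells[element + 1];
    s->scale      = cells[n_elements];
}

/* Idealized coding step: shrink [*low, *high) to the element's share. */
void narrow_range(double *low, double *high, const coder_symbol *s)
{
    double width = *high - *low;
    *high = *low + width * s->high_count / s->scale;
    *low  = *low + width * s->low_count  / s->scale;
}
```

An element with a wide interval shrinks the range only slightly and therefore costs few bits, which is how frequent elements come to be coded more compactly.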

PROCESS 260: The appropriate table is updated
by incrementing by 1 the contents of each cell starting 
from the cell following the cell corresponding to the 
current code element. 

For example, if the fourth MR_control type code 
element is encountered, the contents of the fifth to 
ninth cells of the MR_control table are incremented by 1.
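
The update of process 260 can be sketched as below, assuming zero-based indices and a cumulative table of `n_elements + 1` cells whose last cell holds the running total. The name `table_update` is hypothetical; the loop bounds follow the `update_model` routine of Appendix A, which increments every cell above the current element up to and including the total.

```c
/* Widen the interval of `element` by 1 by shifting every later cell up. */
void table_update(unsigned short *cells, int n_elements, int element)
{
    for (int i = element + 1; i <= n_elements; i++)
        cells[i]++;
}
```

Only the interval of the current element grows; all later intervals keep their length because both of their end cells move by the same amount.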

Preferably, old frequency information is given
less weight than new frequency information. One implementation
of this rule is:

PROCESS 270: For each type t, each time N_t code
elements of type t have been processed, divide the cell
contents of the frequency interval table of type t, by a
suitable number such as 2. Suitable N_t values are: 256
for MR control type, 2048 for black and white run length
types.
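
One way to realize process 270 is sketched below, under the assumption that a per-type counter triggers the halving every N_t elements. The names `table_halve`, `aged_table` and `aged_table_update` are hypothetical. The guard that keeps each cell above its predecessor is borrowed from `update_model` in Appendix A, which otherwise triggers its rescale when the running total reaches a maximum scale rather than on an element count.

```c
/* Halve every cell while keeping the table strictly increasing, so that
   no interval collapses to zero. cells[0] is the fixed base of the table. */
void table_halve(unsigned short *cells, int n_elements)
{
    for (int i = 1; i <= n_elements; i++) {
        cells[i] /= 2;
        if (cells[i] <= cells[i - 1])
            cells[i] = cells[i - 1] + 1;
    }
}

typedef struct {
    unsigned short *cells;      /* n_elements + 1 cumulative cells          */
    int             n_elements;
    unsigned        processed;  /* elements of this type since last halving */
    unsigned        period;     /* N_t, e.g. 256 or 2048                    */
} aged_table;

/* Process 260 followed by process 270 for one code element. */
void aged_table_update(aged_table *t, int element)
{
    for (int i = element + 1; i <= t->n_elements; i++)
        t->cells[i]++;
    if (++t->processed == t->period) {
        table_halve(t->cells, t->n_elements);
        t->processed = 0;
    }
}
```

Because counts accumulated before a halving are divided while later counts are added at full weight, recent statistics dominate the estimate.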

Appendix A is a computer listing in C language, 
of a preferred software embodiment of the MR coding, 
arithmetic coding and MR code element frequency accumula- 
tion units of Fig. 1. 

Appendix B is a computer listing in C language, 
of a preferred software embodiment of the arithmetic 
decoding, MR code element frequency accumulation and MR 
decoding units of Fig. 1. 

The programs listed in Appendices A and B may
be run on a conventional computer such as any UNIX
computer.

It is appreciated that the MR coding described 
hereinabove may alternatively be replaced by MMR coding
or other similar coding schemes. 

It is appreciated that the invention shown and 
described herein is suitable for compressing and decom- 
pressing any type of binarized image, such as binarized 
discrete level images or binarized continuous level 
images, also termed herein "halftone images". 

In certain applications, it may be desirable to 
use the compression methods shown and described herein to 
compress only a portion of a binarized image. For exam- 
ple, in medical imaging applications, the compression 
methods shown and described herein may be employed to 
generally losslessly compress the foreground of the 
medical image whereas the background of the medical image 
may be compressed using lossy techniques.

It is appreciated that the software components
of the present invention may, if desired, be implemented
in ROM (read-only memory) form. The software components
may, generally, be implemented in hardware, if desired,
using conventional techniques.

It is appreciated that the particular embodi- 
ment described in the Appendices is intended only to 
provide an extremely detailed disclosure of the present 
invention and is not intended to be limiting. 

It is appreciated that various features of the 
invention which are, for clarity, described in the con- 
texts of separate embodiments may also be provided in 
combination in a single embodiment. Conversely, various 
features of the invention which are, for brevity, de- 
scribed in the context of a single embodiment may also be 
provided separately or in any suitable subcombination. 

It will be appreciated by persons skilled in 
the art that the present invention is not limited to what 
has been particularly shown and described hereinabove. 
Rather, the scope of the present invention is defined 
only by the claims that follow:

C:\ARIK\COMPRESS\PTNTSRC\AGCMP.C - Thu Aug 25 09:03:04 1994
AGCMP COMPRESSION UTILITY

The following sources implement the suggested compression technique
previously described.

The agcmp program compresses a raw binary file (with no headers and with
a known line length) to a compressed file on the disk.

FILES:

agcmp.c - the main loop for compression. Converts the raw file to
MR codes and passes them to the arithmetic coder.

The following sources are common to both programs - agcmp and
agexp (Decompression) and handle the statistical estimation
(element frequency accumulation) and the arithmetic coding:

amdl.c - Statistical estimation. Based on a source from Dr. Dobb's
Journal, February 1991, "Arithmetic Coding and Statistical
Modeling" by Mark R. Nelson, but modified to fit compression
of MR codes.

acoder.c, abitio.c - implement the arithmetic coder, based on Dr. Dobb's
Journal.

COMPILATION:

agcmp: cc agcmp.c amdl.c acoder.c abitio.c

FURTHER INFORMATION about agcmp.c:

AUTHOR: Arik Cordon

INPUT: A rastered file (No headers!) with 1728 binary pixels per line
OUTPUT: compressed file.
USAGE: agcmp IN_FILE OUT_FILE

Desc: This source opens a rastered binary file, converts it to codes
according to MR standard, and passes the codes to the arithmetic
coder. The compressed file is constructed from a header (see agcmp.h)
and the compressed entropy coded stream.


#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <memory.h>
#include <malloc.h>
#include <sys\types.h>
#include <sys\stat.h>
#include <dos.h>
#include "acoder.h"
#include "amodel.h"
#include "abitio.h"
#include "agcmp.h"

static char *last_line_in_prev_strip;

long agcmp(char *infile, char *outfile); // returns size in bytes
long add_file(char *, int out);
long mr_compress_strip(char buf1[], int lines);
void modified_READ(char *prev, char *cur, char *next, int length);
void one_line_modified_read(char *prev, char *curr, int length);
void put_rl(int len, int color);
void put_code(int len, int color);
void put_EOL();
find_next(int color, int pos, char *line, int len);
void erase_single_dots(char *prev, char *curr, char *next, int len);

main(int argc, char *argv[])
{
    if (argc != 3) {
        fprintf(stderr, "\nUsage: %s iMC_file_name G3_output_file_name\n", argv[0]);
        exit(1);
    }
    printf("total_bytes = %ld\n", agcmp(argv[1], argv[2]));
}



long agcmp(char *infile, char *outfile) // returns size in bytes
{
    long total_bytes = 0L;
    unsigned int i, j = 0, k, file_count;
    char *buf1;
    unsigned size_in_bytes;
    AG_HEADER ag_header;
    int fd1, fdtmp;

    if ((fd1 = open(infile, O_RDONLY | O_BINARY, S_IREAD | S_IWRITE)) < 1)
        BigErr(0, "cmr: can't open");

    /* INITIALIZE ARITHMETIC CODER */
    initialize_model();
    init_mr_model();
    initialize_output_bitstream(outfile, &ag_header, sizeof(AG_HEADER));
    initialize_arithmetic_encoder();

    if ((buf1 = malloc(STRIP_SIZE * BYTES_PER_LINE)) == NULL)
        BigErr(0, "agcmp: no mem");
    if ((last_line_in_prev_strip = malloc(PELS_PER_LINE)) == NULL)
        BigErr(0, "agcmp1: no mem");
    memset(last_line_in_prev_strip, 0, PELS_PER_LINE);
    ag_header.number_of_lines_in_file = 0;

    /* Main loop for Compression */
    while ((file_count = read(fd1, buf1, STRIP_SIZE * BYTES_PER_LINE)) >= BYTES_PER_LINE) {
        fprintf(stderr, "COMPRESSING STRIP #%d\r", j++);
        ag_header.number_of_lines_in_file += file_count / BYTES_PER_LINE;
        mr_compress_strip(buf1, file_count / BYTES_PER_LINE);
        _heapmin();
    }
    fprintf(stderr, "\n");
    free(last_line_in_prev_strip);
    free(buf1);
    close(fd1);
    _heapmin();

    /* Finish and close arithmetic coding */
    code_EOF();
    flush_arithmetic_encoder();
    total_bytes = flush_output_bitstream(&ag_header, sizeof(AG_HEADER));
    free_mdl_bufs();
    return (total_bytes);
}

/* compress one strip (arbitrary size, defined in agcmp.h) */
long mr_compress_strip(char buf1[], int lines)
{
    char array[3][PELS_PER_LINE];
    unsigned k, i, cur_line = 2, off;

    // Fill first 2 lines in array.
    for (k = 0; k < min(2, lines); k++)
        for (i = 0; i < PELS_PER_LINE; i++)
            array[k + 1][i] = ((buf1[k * BYTES_PER_LINE + i / 8] & (1 << (7 - (i % 8)))) != 0);

    if (lines > 0) // There is at least 1 line to compress
        modified_READ(NULL, &array[1][0], &array[2][0], PELS_PER_LINE); // First array compression

    /* convert packed bits to "1 bit per byte" format */
    while (cur_line < lines) {
        memcpy(&array[0][0], &array[1][0], 2 * PELS_PER_LINE);
        for (i = 0; i < PELS_PER_LINE; i++) {
            off = cur_line * BYTES_PER_LINE + i / 8;
            if (buf1[off] == 0) {
                memset(&(array[2][i]), 0, 8);
                i += 7;
                continue;
            }
            if (buf1[off] == 255) {
                memset(&(array[2][i]), 1, 8);
                i += 7;
                continue;
            }
            array[2][i] = ((buf1[off] & (1 << (7 - (i % 8)))) != 0);
        }
        cur_line++;
        /* compress one line (given the previous line) */
        /* (we also provide the next line in case some filtering is
           desired) */
        modified_READ(&array[0][0], &array[1][0], &array[2][0], PELS_PER_LINE);
    }

    /* do last line */
    if (lines > 1) {
        memcpy(&array[0][0], &array[1][0], 2 * PELS_PER_LINE);
        modified_READ(&array[0][0], &array[1][0], NULL, PELS_PER_LINE);
    }

    return (1);
}

void modified_READ(char *prev, char *cur, char *next, int length)
{
    int j;
    long i;

    cur[0] = WHITE; // don't accept a black pixel on line beginning

    if (prev == NULL) {
        one_line_modified_read(last_line_in_prev_strip, cur, length);
        return;
    }
    memcpy(last_line_in_prev_strip, cur, PELS_PER_LINE);
    one_line_modified_read(prev, cur, length);
}

/* Here we actually translate the line to MR codes + Run-Lengths,
   and pass the codes to the arithmetic coder */
void one_line_modified_read(char *prev, char *curr, int length)
{
    int a0, a1, a2, b1, b2, a0_color;

    a0 = -1; a0_color = WHITE;

    // *curr = WHITE; // don't accept a black pixel on line beginning

    do {
        a1 = find_next(!a0_color, a0 + 1, curr, length);
        a2 = find_next(a0_color, a1 + 1, curr, length);

        if (a0 == -1)
            b1 = find_next(!a0_color, a0 + 1, prev, length);
        else if (prev[a0] == a0_color)
            b1 = find_next(!a0_color, a0 + 1, prev, length);
        else {
            b1 = find_next(a0_color, a0 + 1, prev, length);
            b1 = find_next(!a0_color, b1 + 1, prev, length);
        }

        b2 = find_next(a0_color, b1 + 1, prev, length);
        // code it
        if (b2 < a1) { // pass mode
            //printf("PASS (a0=%d, a1=%d, a2=%d, b1=%d, b2=%d)\n", a0, a1, a2, b1, b2);
            code_1(MR_CONTROL, PASS);
            a0 = b2;
        } else if (abs(a1 - b1) <= 3) { // vertical mode
            switch (a1 - b1) {
            case 0:
                //printf("V0 (a0=%d, a1=%d, a2=%d, b1=%d, b2=%d)\n", a0, a1, a2, b1, b2);
                code_1(MR_CONTROL, V0);
                break;
            case 1:
                //printf("VR1 (a0=%d, a1=%d, a2=%d, b1=%d, b2=%d)\n", a0, a1, a2, b1, b2);
                code_1(MR_CONTROL, VR1);
                break;
            case -1:
                //printf("VL1 (a0=%d, a1=%d, a2=%d, b1=%d, b2=%d)\n", a0, a1, a2, b1, b2);
                code_1(MR_CONTROL, VL1);
                break;
            case 2:
                //printf("VR2 (a0=%d, a1=%d, a2=%d, b1=%d, b2=%d)\n", a0, a1, a2, b1, b2);
                code_1(MR_CONTROL, VR2);
                break;
            case -2:
                //printf("VL2 (a0=%d, a1=%d, a2=%d, b1=%d, b2=%d)\n", a0, a1, a2, b1, b2);
                code_1(MR_CONTROL, VL2);
                break;
            case 3:
                //printf("VR3 (a0=%d, a1=%d, a2=%d, b1=%d, b2=%d)\n", a0, a1, a2, b1, b2);
                code_1(MR_CONTROL, VR3);
                break;
            case -3:
                //printf("VL3 (a0=%d, a1=%d, a2=%d, b1=%d, b2=%d)\n", a0, a1, a2, b1, b2);
                code_1(MR_CONTROL, VL3);
                break;
            }
            a0 = a1;
        } else { // HORIZONTAL MODE
            if (a0 == -1)
                a0 = 0;
            //printf("HORIZONTAL color = %d, LEN1 = %d, LEN2 = %d (a0=%d)\n", a0_color, a1-a0, a2-a1, a0);
            code_1(MR_CONTROL, HOR);
            put_rl(a1 - a0, a0_color);
            put_rl(a2 - a1, !a0_color);
            a0 = a2;
        }

        if (a0 < length)
            a0_color = curr[a0];
    } while (a0 < length);

    //printf("EOL\n");
    //put_EOL(line); /* we don't need it because next a0 is beyond line */
}

/* converts a single run-length (unlimited length) to several runs
   according to MR (Group 3, 4) spec */
void put_rl(int len, int color)
{
    if (len > 63) {
        put_code((len / 64) + 63, color);
        len -= (len / 64) * 64;
    }
    put_code(len, color);
}

/* codes one legitimate run */
void put_code(int len, int color)
{
    code_1(color, BW_SYMBOLS - len - 1);
}

/* we do not need this if we know the line length in advance */
void put_EOL()
{
    //code_1(WHITE, EOL);
    //code_1(BLACK, EOL);
}

/* finds the next color interchange */
find_next(int color, int pos, char *line, int len)
{
    int i;
    char *ptr;

    if (pos > len - 1)
        return (len);

    if ((ptr = memchr(line + pos, color, len - pos)) == NULL)
        return len;
    else
        return (ptr - line);
}

BigErr(int n, char *s) // too many bits in strip.
{
    printf("Err %d - %s", n, s);
    exit(9);
}

/* codes 1 symbol (Control or Black Run or White Run) */
code_1(int mode, int c)
{
    SYMBOL s;

    convert_int_to_symbol(c, &s, mode);
    encode_symbol(&s);
    update_model(c);
}

/* to finish with the arithmetic coding: */
code_EOF()
{
    SYMBOL s;

    convert_int_to_symbol(EOF, &s, MR_CONTROL);
    encode_symbol(&s);
}

/* Desc: Header file mainly for agcmp.c, agexp.c */
/* AUTHOR: Arik Cordon */

/* This is a header that appears at the beginning of the compressed file */
typedef struct AG_HEADER {
    long total_bytes;
    long number_of_lines_in_file;
} AG_HEADER;

/* in our implementation we assume a standard fax document with 1728 pixels
   per line */
#define PELS_PER_LINE 1728
#define BYTES_PER_LINE 216

#define STRIP_SIZE 100

#define WHITE 0
#define BLACK 1
#define MR_CONTROL 2

#define MR_SYMBOLS 9
#define BW_SYMBOLS 93

#define V0 8
#define PASS 2
#define VL1 3
#define VR1 4
#define HOR 5
#define VL2 6
#define VL3 7
#define VR2 1
#define VR3 0



#define beep() putch(7)

/*
 * Listing 9 - amdl.c
 * AUTHOR: Originally from Dr. Dobb's, Feb 1991. Substantially modified
 *         by Arik Cordon.
 *
 * This is the statistical estimation module for compressing
 * MR codes. There are three types of codes: MR_CONTROL, Black Run-Length
 * and White Run-Length. For each type we have a separate statistical
 * estimator of order 0 for run-lengths and order 2 for MR_CONTROL.
 *
 * This is a relatively simple model. For each symbol type,
 * the totals for all of the symbols are stored in a corresponding
 * array (e.g. "mr_storage"). This array has valid indices from -1
 * to Ni. The reason for having a -1 element is because the EOF
 * symbol is included in the table, and it has a value of -1.
 * (Ni = number of different symbols for each type)
 *
 * The total count for all the symbols is stored in totals[Ni], and
 * the low and high counts for symbol c are found in array[c] and
 * array[c + 1].
 */

#include <stdio.h>
#include <stdlib.h>
#include <malloc.h>
#include <io.h>
#include <errno.h>
#include <fcntl.h>
#include <sys\types.h>
#include <sys\stat.h>

#include "AGCMP.H"
#include "acoder.h"
#include "amodel.h"

/*
 * In order to create an array with indices -1 through num_of_symbols, I have
 * to do this funny declaration. totals[-1] == storage[0].
 */

short int **mr_storage;
short int *wt_storage;
short int *bl_storage;
short int *totals;

static int num_of_symbols, maximum_scale;
static int prev, prev1;

/*
 * When the model is first started up, each symbol has a count of
 * 1, which means a low value of c + 1, and a high value of c + 2.
 */

void initialize_model()
{
    int i, j, order_2_symbols;

    prev = prev1 = 0;
    num_of_symbols = MR_SYMBOLS;
    order_2_symbols = num_of_symbols * num_of_symbols;
    mr_storage = (int **) malloc(sizeof(int *) * (order_2_symbols + 1));
    for (i = 0; i < order_2_symbols; i++)
        mr_storage[i] = malloc(sizeof(int) * (num_of_symbols + 2));

    for (j = 0; j < order_2_symbols; j++) {
        totals = &(mr_storage[j][1]);
        for (i = -1; i <= num_of_symbols; i++)
            totals[i] = i + 1;
    }

    num_of_symbols = BW_SYMBOLS;

    wt_storage = malloc((num_of_symbols + 2) * sizeof(int));
    totals = &(wt_storage[1]);
    for (i = -1; i <= num_of_symbols; i++)
        totals[i] = i + 1;

    bl_storage = malloc((num_of_symbols + 2) * sizeof(int));
    totals = &(bl_storage[1]);
    for (i = -1; i <= num_of_symbols; i++)
        totals[i] = i + 1;
}



/*
 * Updating the model means incrementing every single count from
 * the high value for the symbol on up to the total. Then, there
 * is a complication. If the cumulative total has gone up to
 * the maximum value, we need to rescale. Fortunately, the rescale
 * operation is relatively rare.
 */

void update_model( int symbol )
{
    int i;

    for ( symbol++ ; symbol <= num_of_symbols ; symbol++ )
        totals[ symbol ]++;
    if ( totals[ num_of_symbols ] == maximum_scale )
        for ( i = 0 ; i <= num_of_symbols ; i++ )
        {
            totals[ i ] /= 2;
            if ( totals[ i ] <= totals[ i-1 ] )
                totals[ i ] = totals[ i-1 ] + 1;
        }
}
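The cumulative-count layout and the halving rescale described above can be tried in isolation. The following is a minimal sketch, not the listing's own code: the `cum_table` type and the `cum_table_*` function names are hypothetical, and a single table stands in for the separate white, black, and MR_CONTROL tables.

```c
#include <stdlib.h>

/* Hypothetical stand-alone version of one frequency table. */
typedef struct {
    short *storage;   /* raw allocation of n + 2 entries            */
    short *totals;    /* storage + 1, so that totals[-1] is valid   */
    int n;            /* number of ordinary symbols (0 .. n-1)      */
    int max_scale;    /* rescale when totals[n] reaches this value  */
} cum_table;

/* Give every symbol, including the -1 (EOF) symbol, a count of 1. */
int cum_table_init(cum_table *t, int n, int max_scale)
{
    int i;

    t->storage = malloc((size_t)(n + 2) * sizeof *t->storage);
    if (t->storage == NULL)
        return -1;
    t->totals = t->storage + 1;
    t->n = n;
    t->max_scale = max_scale;
    for (i = -1; i <= n; i++)
        t->totals[i] = (short)(i + 1);
    return 0;
}

/* Raise every cumulative count above the symbol, halving all counts
 * once the grand total reaches max_scale. */
void cum_table_update(cum_table *t, int symbol)
{
    int i;

    for (i = symbol + 1; i <= t->n; i++)
        t->totals[i]++;
    if (t->totals[t->n] == t->max_scale) {
        for (i = 0; i <= t->n; i++) {
            t->totals[i] /= 2;
            /* keep the table strictly increasing after halving */
            if (t->totals[i] <= t->totals[i - 1])
                t->totals[i] = (short)(t->totals[i - 1] + 1);
        }
    }
}

void cum_table_free(cum_table *t)
{
    free(t->storage);
    t->storage = NULL;
    t->totals = NULL;
}
```

With four symbols, updating symbol 2 once widens its interval from [3, 4) to [3, 5) and raises the total from 5 to 6. The second pass in the rescale loop guarantees that no symbol ends up with a zero-width interval.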



/*
 * Finding the low count, high count, and scale for a symbol
 * is really easy, because of the way the totals are stored.
 * This is the one redeeming feature of the data structure used
 * in this implementation.
 */




int convert_int_to_symbol( int c, SYMBOL *s, int mode )
{
    switch (mode) {
    case WHITE:
        totals = wt_storage + 1;
        num_of_symbols = BW_SYMBOLS;
        maximum_scale = 2048;
        break;
    case BLACK:
        totals = bl_storage + 1;
        num_of_symbols = BW_SYMBOLS;
        maximum_scale = 2048;
        break;
    case MR_CONTROL:
        num_of_symbols = MR_SYMBOLS;
        totals = mr_storage[(prev1 * num_of_symbols + prev)] + 1;
        prev1 = prev;
        prev = c;
        maximum_scale = 256;
        break;
    }

    s->scale = totals[ num_of_symbols ];
    s->low_count = totals[ c ];
    s->high_count = totals[ c + 1 ];
    return( 0 );
}



/*
 * Getting the scale for the current context is easy.
 */

void get_symbol_scale( SYMBOL *s, int mode, int prev, int prev1 )
{
    switch (mode) {
    case WHITE:
        totals = wt_storage + 1;
        num_of_symbols = BW_SYMBOLS;
        maximum_scale = 2048;
        break;
    case BLACK:
        totals = bl_storage + 1;
        num_of_symbols = BW_SYMBOLS;
        maximum_scale = 2048;
        break;
    case MR_CONTROL:
        num_of_symbols = MR_SYMBOLS;
        totals = mr_storage[(prev1 * num_of_symbols + prev)] + 1;
        maximum_scale = 256;
        break;
    }

    s->scale = totals[ num_of_symbols ];
}

/*
 * During decompression, we have to search through the table until
 * we find the symbol that straddles the "count" parameter. When
 * it is found, it is returned. The reason for also setting the
 * high count and low count is so that symbol can be properly removed
 * from the arithmetic coded input.
 */

int convert_symbol_to_int( int count, SYMBOL *s )
{
    int c;

    for ( c = num_of_symbols - 1 ; count < totals[ c ] ; c-- )
        ;
    s->high_count = totals[ c + 1 ];
    s->low_count = totals[ c ];
    return( c );
}
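The downward scan relies on the entry at index -1, which is always 0, to stop the loop even when the count belongs to the EOF symbol. A minimal sketch of the same search over a caller-supplied table follows; `find_symbol` and its parameters are hypothetical names, and `totals` is assumed to point at the second element of the underlying array.

```c
/* Return the symbol whose cumulative interval [totals[c], totals[c+1])
 * contains count, and report that interval through low and high.
 * totals[-1] must be readable and equal to 0. */
int find_symbol(const short *totals, int n, int count, int *low, int *high)
{
    int c = n - 1;

    while (count < totals[c])
        c--;
    *low = totals[c];
    *high = totals[c + 1];
    return c;
}
```

For the table {0, 1, 2, 3, 5, 6} with `totals` pointing at the entry 1 and `n` equal to 4, a count of 4 yields symbol 2 with interval [3, 5), and a count of 0 yields symbol -1 with interval [0, 1).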

/* The following is an optional module that initializes the
   estimation tables with predefined values. It can slightly improve
   compression of small files */

init_mr_model()
{
    int i;

    update_initial_mr_model(V0, 6);
    update_initial_mr_model(VL1, 2);
    update_initial_mr_model(VR1, 2);
    update_initial_mr_model(HOR, 2);
    update_initial_mr_model(PASS, 1);
}

update_initial_mr_model( int symbol, int count )
{
    int i, prev, prev1, j;

    num_of_symbols = MR_SYMBOLS;
    maximum_scale = 256;

    for (prev = 0; prev < num_of_symbols; prev++)
        for (prev1 = 0; prev1 < num_of_symbols; prev1++) {
            totals = mr_storage[prev1 * num_of_symbols + prev] + 1;
            for (j = 0; j < count; j++)
                update_model(symbol);
        }
}

free_amdl_bufs()
{
    int i, order_2_symbols;

    num_of_symbols = MR_SYMBOLS;
    order_2_symbols = num_of_symbols * num_of_symbols;

    for (i = 0; i < order_2_symbols; i++)
        free(mr_storage[i]);
    free(mr_storage);

    num_of_symbols = BW_SYMBOLS;
    free(wt_storage);
    free(bl_storage);
    // _heapmin();
}

/*
 * Listing 8 - amodel.h
 *
 * This file contains all of the function prototypes and
 * external variable declarations needed to interface with
 * the modeling code found in amdl.c.
 */

/*
 * External variable declarations.
 */

extern int max_order;
extern int flushing_enabled;

/*
 * Prototypes for routines that can be called from MODEL-X.C
 */

void initialize_model( void );
void update_model( int symbol );
int convert_int_to_symbol( int symbol, SYMBOL *s, int mode );
void get_symbol_scale( SYMBOL *s, int mode, int prev, int prev1 );
int convert_symbol_to_int( int count, SYMBOL *s );
void add_character_to_model( int c );
void flush_model( void );

/*
 * Listing 2 - coder.c
 *
 * SOURCE: Dr. Dobb's Journal, Feb 1991 + minor modifications by
 * Arik Cordon
 *
 * This file contains the code needed to accomplish arithmetic
 * coding of a symbol. All the routines in this module need
 * to know in order to accomplish coding is what the probabilities
 * and scales of the symbol counts are. This information is
 * generally passed in a SYMBOL structure.
 *
 * This code was first published by Ian H. Witten, Radford M. Neal,
 * and John G. Cleary in "Communications of the ACM" in June 1987,
 * and has been modified slightly.
 */

#include <stdio.h>
#include "acoder.h"
#include "abitio.h"
#include "AGCMP.H"

/*
 * These four variables define the current state of the arithmetic
 * coder/decoder. They are assumed to be 16 bits long. Note that
 * by declaring them as short ints, they will actually be 16 bits
 * on most 80X86 and 680X0 machines, as well as VAXen.
 */

static unsigned short int code;  /* The present input code value       */
static unsigned short int low;   /* Start of the current code range    */
static unsigned short int high;  /* End of the current code range      */
long underflow_bits;             /* Number of underflow bits pending   */

/*
 * This routine must be called to initialize the encoding process.
 * The high register is initialized to all 1s, and it is assumed that
 * it has an infinite string of 1s to be shifted into the lower bit
 * positions when needed.
 */

void initialize_arithmetic_encoder()
{
    low = 0;
    high = 0xffff;
    underflow_bits = 0;
}

/*
 * This routine is called to encode a symbol. The symbol is passed
 * in the SYMBOL structure as a low count, a high count, and a range,
 * instead of the more conventional probability ranges. The encoding
 * process takes two steps. First, the values of high and low are
 * updated to take into account the range restriction created by the
 * new symbol. Then, as many bits as possible are shifted out to
 * the output stream. Finally, high and low are stable again and
 * the routine returns.
 */

void _fastcall encode_symbol( SYMBOL *s )
{
    long range;

/*
 * These three lines rescale high and low for the new symbol.
 */
    range = (long) ( high-low ) + 1;
    high = low + (unsigned short int)
                 (( range * s->high_count ) / s->scale - 1 );
    low = low + (unsigned short int)
                 (( range * s->low_count ) / s->scale );
/*
 * This loop turns out new bits until high and low are far enough
 * apart to have stabilized.
 */
    for ( ; ; )
    {
/*
 * If this test passes, it means that the MSDigits match, and can
 * be sent to the output stream.
 */
        if ( ( high & 0x8000 ) == ( low & 0x8000 ) )
        {
            output_bit( high & 0x8000 );
            while ( underflow_bits > 0 )
            {
                output_bit( ~high & 0x8000 );
                underflow_bits--;
            }
        }
/*
 * If this test passes, the numbers are in danger of underflow, because
 * the MSDigits don't match, and the 2nd digits are just one apart.
 */
        else if ( ( low & 0x4000 ) && !( high & 0x4000 ))
        {
            underflow_bits += 1;
            low &= 0x3fff;
            high |= 0x4000;
        }
        else
            return;
        low <<= 1;
        high <<= 1;
        high |= 1;
    }
}
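The range-narrowing step at the top of the encoder can be checked by hand with a small stand-alone function. This is a sketch of that one step only, without the bit-shifting loop; the `ac_range` type and the `narrow` function are hypothetical names introduced for illustration.

```c
/* Current coding interval, held in two 16-bit registers. */
typedef struct {
    unsigned short low;
    unsigned short high;
} ac_range;

/* Narrow the interval to the part that belongs to a symbol whose
 * cumulative counts are [low_count, high_count) out of scale. */
ac_range narrow(ac_range r, long low_count, long high_count, long scale)
{
    long range = (long)(r.high - r.low) + 1;
    ac_range out;

    out.high = (unsigned short)(r.low + (range * high_count) / scale - 1);
    out.low  = (unsigned short)(r.low + (range * low_count) / scale);
    return out;
}
```

Starting from the full interval [0x0000, 0xffff] and coding a symbol with counts [3, 5) on a scale of 6 gives low = 32768 and high = 54612. Both values have their top bit set, so in the full encoder that bit would be shifted out immediately.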

/*
 * At the end of the encoding process, there are still significant
 * bits left in the high and low registers. We output two bits,
 * plus as many underflow bits as are necessary.
 */

void flush_arithmetic_encoder()
{
    output_bit( low & 0x4000 );
    underflow_bits++;
    while ( underflow_bits-- > 0 )
        output_bit( ~low & 0x4000 );
}



/*
 * When decoding, this routine is called to figure out which symbol
 * is presently waiting to be decoded. This routine expects to get
 * the current model scale in the s->scale parameter, and it returns
 * a count that corresponds to the present floating point code:
 *
 *  code = count / s->scale
 */

int get_current_count( SYMBOL *s )
{
    long range;
    short int count;

    range = (long) ( high - low ) + 1;
    count = (short int)
            ((((long) ( code - low ) + 1 ) * s->scale - 1 ) / range );
    return( count );
}
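The count computation inverts the encoder's range narrowing: it maps the code register back to a position on the model's scale. A minimal sketch with the registers passed as arguments rather than held in file-level statics follows; `current_count` is a hypothetical name.

```c
/* Map the code value inside [low, high] back to a cumulative count
 * in the range 0 .. scale-1. */
int current_count(unsigned short code, unsigned short low,
                  unsigned short high, long scale)
{
    long range = (long)(high - low) + 1;

    return (int)((((long)(code - low) + 1) * scale - 1) / range);
}
```

With the full interval [0x0000, 0xffff] and a scale of 6, code values 32768 and 54612 map to counts 3 and 4, which both fall inside a symbol whose cumulative counts are [3, 5), while 32767 maps to count 2, just outside it.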

/*
 * This routine is called to initialize the state of the arithmetic
 * decoder. This involves initializing the high and low registers
 * to their conventional starting values, plus reading the first
 * 16 bits from the input stream into the code value.
 */

void initialize_arithmetic_decoder()
{
    int i;

    code = 0;
    for ( i = 0 ; i < 16 ; i++ )
    {
        code <<= 1;
        code += input_bit();
    }
    low = 0;
    high = 0xffff;
}

/*
 * Just figuring out what the present symbol is doesn't remove
 * it from the input bit stream. After the character has been
 * decoded, this routine has to be called to remove it from the
 * input stream.
 */

void remove_symbol_from_stream( SYMBOL *s )
{
    long range;

/*
 * First, the range is expanded to account for the symbol removal.
 */
    range = (long)( high - low ) + 1;
    high = low + (unsigned short int)
                 (( range * s->high_count ) / s->scale - 1 );
    low = low + (unsigned short int)
                 (( range * s->low_count ) / s->scale );
/*
 * Next, any possible bits are shipped out.
 */
    for ( ; ; )
    {
/*
 * If the MSDigits match, the bits will be shifted out.
 */
        if ( ( high & 0x8000 ) == ( low & 0x8000 ) )
        {
        }
/*
 * Else, if underflow is threatening, shift out the 2nd MSDigit.
 */
        else if ((low & 0x4000) == 0x4000 && (high & 0x4000) == 0 )
        {
            code ^= 0x4000;
            low &= 0x3fff;
            high |= 0x4000;
        }
/*
 * Otherwise, nothing can be shifted out, so I return.
 */
        else
            return;
        low <<= 1;
        high <<= 1;
        high |= 1;
        code <<= 1;
        code += input_bit();
    }

}

/*
 * Listing 1 - acoder.h
 *
 * This header file contains the constants, declarations, and
 * prototypes needed to use the arithmetic coding routines. These
 * declarations are for routines that need to interface with the
 * arithmetic coding stuff in acoder.c
 *
 */

#define MAXIMUM_SCALE   2048 // 16383  /* Maximum allowed frequency count */
#define ESCAPE          256    /* The escape symbol               */
#define DONE            -1     /* The output stream empty symbol  */
#define FLUSH           -2     /* The symbol to flush the model   */

/*
 * A symbol can either be represented as an int, or as a pair of
 * counts on a scale. This structure gives a standard way of
 * defining it as a pair of counts.
 */

typedef struct {
    unsigned short int low_count;
    unsigned short int high_count;
    unsigned short int scale;
} SYMBOL;

extern long underflow_bits;    /* The present underflow count in  */
                               /* the arithmetic coder.           */

/*
 * Function prototypes.
 */

void initialize_arithmetic_decoder();
void remove_symbol_from_stream( SYMBOL *s );
void initialize_arithmetic_encoder( void );
void encode_symbol( SYMBOL *s );
void flush_arithmetic_encoder();

int get_current_count( SYMBOL *s );

/*
 * Listing 4 - abitio.c
 *
 * SOURCE: Dr. Dobb's Journal, Feb 1991 + minor modifications by
 * Arik Cordon
 *
 * This routine contains a set of bit oriented i/o routines
 * used for arithmetic data compression. The important fact to
 * know about these is that the first bit is stored in the msb of
 * the first byte of the output, like you might expect.
 *
 * Both input and output maintain a local buffer so that they only
 * have to do block reads and writes. This is done in spite of the
 * fact that C standard I/O does the same thing. If these
 * routines are ever ported to assembly language the buffering
 * will come in handy.
 *
 */

#include <stdio.h>
#include <stdlib.h>
#include "acoder.h"
#include "abitio.h"

#include "AGCMP.H"

#define BUFFER_SIZE 8192

static char *buffer;               /* This is the i/o buffer    */
static char *current_byte;         /* Pointer to current byte   */

static int output_mask;            /* During output, this byte  */
                                   /* contains the mask that is */
                                   /* applied to the output byte*/
                                   /* if the output bit is a 1  */

static int input_bytes_left;       /* During input, these three */
static int input_bits_left;        /* variables keep track of my*/
static int past_eof;               /* input state. The past_eof */
                                   /* byte comes about because  */
                                   /* of the fact that there is */
static long total_bytes;           /* a possibility the decoder */
                                   /* can legitimately ask for  */
                                   /* more bits even after the  */
                                   /* entire file has been      */
                                   /* sucked dry.               */

static FILE *stream;



/*
 * This routine is called once to initialize the output bitstream.
 * All it has to do is set up the current_byte pointer, clear out
 * all the bits in my current output byte, and set the output mask
 * so it will set the proper bit next time a bit is output.
 */

void initialize_output_bitstream(char *file, void *header, unsigned int header_size)
{
    buffer = malloc(BUFFER_SIZE+2);
    if (buffer == NULL) {
        printf("\niobs:no mem\n");
        exit(9);
    }
    total_bytes = 0L;
    current_byte = buffer;
    *current_byte = 0;
    output_mask = 0x80;
    stream = fopen(file, "wb");
    setvbuf( stream, NULL, _IOFBF, 8192 );
    total_bytes += fwrite(header, 1, header_size, stream);
    //printf("total bytes = %ld\n", total_bytes);
}

/*
 * The output bit routine just has to set a bit in the current byte
 * if requested to. After that, it updates the mask. If the mask
 * shows that the current byte is filled up, it is time to go to the
 * next character in the buffer. If the next character is past the
 * end of the buffer, it is time to flush the buffer.
 */

void output_bit(int bit)
{
    if ( bit )
        *current_byte |= output_mask;
    output_mask >>= 1;
    if ( output_mask == 0 )
    {
        output_mask = 0x80;
        current_byte++;
        if ( current_byte == ( buffer + BUFFER_SIZE ) )
        {
            total_bytes += fwrite( buffer, 1, BUFFER_SIZE, stream );
            current_byte = buffer;
        }
        *current_byte = 0;
    }
}
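The mask-driven, most-significant-bit-first packing used by the output routine can be exercised without any file I/O. The sketch below writes into a caller-supplied memory buffer instead of a `FILE` stream; the `bit_writer` type and the `bw_*` names are hypothetical, and the buffer flushing of the original is deliberately left out.

```c
#include <stddef.h>

typedef struct {
    unsigned char *buf;   /* destination; needs one spare byte at the end */
    size_t pos;           /* index of the byte currently being filled     */
    unsigned mask;        /* bit that the next call will set, 0x80 first  */
} bit_writer;

void bw_init(bit_writer *w, unsigned char *buf)
{
    w->buf = buf;
    w->pos = 0;
    w->mask = 0x80;
    w->buf[0] = 0;
}

/* Append one bit; any non-zero value counts as a 1. */
void bw_put(bit_writer *w, int bit)
{
    if (bit)
        w->buf[w->pos] |= (unsigned char)w->mask;
    w->mask >>= 1;
    if (w->mask == 0) {
        /* current byte is full: move on and clear the next one */
        w->mask = 0x80;
        w->pos++;
        w->buf[w->pos] = 0;
    }
}
```

Writing the nine bits 1, 0, 1, 1, 0, 0, 0, 1, 1 leaves 0xB1 in the first byte and 0x80 in the second. Because any non-zero argument is treated as a 1, callers can pass expressions such as `high & 0x8000` directly, as the encoder does.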

/*
 * When the encoding is done, there will still be a lot of bits and
 * bytes sitting in the buffer waiting to be sent out. This routine
 * is called to clean things up at that point.
 */

long flush_output_bitstream(void *header, unsigned int header_size)
{
    total_bytes += fwrite( buffer, 1, (size_t)( current_byte - buffer ) + 1, stream );
    current_byte = buffer;
    fseek(stream, 0L, SEEK_SET);
    memcpy(header, &total_bytes, sizeof(long));
    fwrite(header, header_size, 1, stream);
    fclose(stream);
    free(buffer);
    _heapmin();
    return (total_bytes);
}

/*
 * Bit oriented input is set up so that the next time the input_bit
 * routine is called, it will trigger the read of a new block. That
 * is why input_bits_left is set to 0.
 */

void initialize_input_bitstream(char *file, void *header, unsigned int header_size)
{
    buffer = malloc(BUFFER_SIZE+2);
    if (buffer == NULL) {
        printf("\niibs:no mem\n");
        exit(9);
    }
    input_bits_left = 0;
    input_bytes_left = 1;
    past_eof = 0;
    stream = fopen(file, "rb");
    setvbuf( stream, NULL, _IOFBF, 8192 );
    fread(header, 1, header_size, stream);
}

close_input_bitstream()
{
    free(buffer);
    _heapmin();
    fclose(stream);
}

/*
 * This routine reads bits in from a file. The bits are all sitting
 * in a buffer, and this code pulls them out, one at a time. When the
 * buffer has been emptied, that triggers a new file read, and the
 * pointers are reset. This routine is set up to allow for two dummy
 * bytes to be read in after the end of file is reached. This is because
 * we have to keep feeding bits into the pipeline to be decoded so that
 * the old stuff that is 16 bits upstream can be pushed out.
 */

int input_bit()
{
    if ( input_bits_left == 0 )
    {
        current_byte++;
        input_bytes_left--;
        input_bits_left = 8;
        if ( input_bytes_left == 0 )
        {
            input_bytes_left = fread( buffer, 1, BUFFER_SIZE, stream );
            if ( input_bytes_left == 0 )
            {
                if ( past_eof )
                {
                    fprintf( stderr, "Bad input file\n" );
                    exit( -1 );
                }
                else
                {
                    past_eof = 1;
                    input_bytes_left = 2;
                }
            }
            current_byte = buffer;
        }
    }
    input_bits_left--;
    return ( ( *current_byte >> input_bits_left ) & 1 );
}

/*
 * Listing 3 - abitio.h
 *
 * This header file contains the function prototypes needed to use
 * the bitstream i/o routines.
 *
 */

void initialize_output_bitstream(char *file, void *header, unsigned int header_size);

long flush_output_bitstream(void *header, unsigned int header_size);

void initialize_input_bitstream(char *file, void *header, unsigned int header_size);

/*
AGEXP DECOMPRESSION UTILITY

The agexp program decompresses files created by agcmp to
a binary rasterized file (no headers) on the disk.
(File size is fixed and determined in agcmp.h)

FILES:

agexp.c - the main loop for decompression. Retrieves MR codes
          from the arithmetic coder and re-generates the raw
          binary file.

The following sources are common to both programs, agcmp and
agexp (decompression), and handle the statistical estimation
(element frequency accumulation) and the arithmetic coding:

amdl.c  - statistical estimation. Based on a source from Dr. Dobb's
          Journal, February 1991, "Arithmetic Coding and Statistical
          Modeling" by Mark R. Nelson, but modified to fit compression
          of MR codes.

acoder.c, abitio.c - implement the arithmetic coder, based on Dr. Dobb's
          Journal.

COMPILATION:

agexp: cc agexp.c amdl.c acoder.c abitio.c

FURTHER INFORMATION about agexp.c.

AUTHOR: Arik Cordon
INPUT:  Compressed file.
OUTPUT: A rastered file (no headers) with 1728 binary pixels per line.
USAGE:  agexp COMPRESSED_FILE_NAME RASTER_FILE_NAME
DESC:   This is the main loop for the agexp utility. It makes calls to the
        arithmetic coder to retrieve the MR codes, and then builds a
        rastered binary image.
*/



#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <memory.h>
#include <malloc.h>
#include <sys\types.h>
#include <sys\stat.h>
// #include <dos.h>
#include "acoder.h"
#include "amodel.h"
#include "abitio.h"
#include "agcmp.h"

find_next(int color, int pos, char *line, int len);

int uncompress_strip_and_save(unsigned char *compressed, long compressed_size, int fdo);
mr_uncompress(unsigned char *line, unsigned char *prev);
void huf_uncompress(unsigned char *line, unsigned char *str);
void get_bit_stream(char *str, char *buf, long compressed_size);
int pack8(unsigned char *line, unsigned char *buf);
int find_b1(int a0_color, int a0, char *line, int length);

void vertical_code(int *a0_color, int *a0, char *prev, int length, char *curr, int offset);
int find_huf_len(int *a0_color);

#define STRIP_SIZE 100 // can be any number, determines buffer size



main(int argc, char *argv[])
{
    if (argc != 3) {
        fprintf(stderr, "\nusage: %s C3 output_File name IMC filename \n", argv[0]);
        exit(9);
    }

    agexp(argv[1], argv[2]);
}

agexp(char *infile, char *outfile)
{
    char line[PELS_PER_LINE], prev_line[PELS_PER_LINE], *bufo;
    int fdo;
    unsigned char *compressed;
    long compressed_size;   // in bits
    int j = 0, line_num = 0;
    AG_header ag_header;

    if ((fdo = open(outfile, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, S_IREAD | S_IWRITE)) < 1)
        BigErr(9, "AGEXP: Can't open outfile");

    if ( (bufo = malloc(STRIP_SIZE*BYTES_PER_LINE)) == NULL )
        BigErr(9, "AGEXP: no mem");

    memset(prev_line, 0, PELS_PER_LINE);

    initialize_model();
    init_mr_model();
    initialize_input_bitstream(infile, &ag_header, sizeof(ag_header));
    initialize_arithmetic_decoder();
    init_get_1();

    printf("LINES: %ld TOTAL: %ld\n", ag_header.number_of_lines_in_file, ag_header.total_bytes);

    while ( mr_uncompress(line, prev_line) != -1 ) {
        if ( line_num++ % 100 == 0)
            printf("line %d\r", line_num-1);
        memcpy(prev_line, line, PELS_PER_LINE);
        pack8(line, bufo + j*BYTES_PER_LINE);
        if (++j == STRIP_SIZE) {
            write(fdo, bufo, BYTES_PER_LINE*STRIP_SIZE);
            j = 0;
        }
    }

    if (j != 0)
        write(fdo, bufo, BYTES_PER_LINE * j);

    free_amdl_bufs();
    close_get_1();
    free(bufo);
    close(fdo);
}



/****************************************************************/
/*** This loop decompresses one rasterized line               ***/
mr_uncompress(unsigned char *line, unsigned char *prev)
{
    int a0_color = WHITE, b1, b2;
    int a0 = 0, Ma0a1, Ma1a2, code;

    line[0] = WHITE;   // force a white pixel on line beginning

    while (a0 < PELS_PER_LINE) {   // while not EOL
        code = get_1(MR_CONTROL);
        if (code == EOF)
            return(-1);
        switch (code) {
        case V0:
            vertical_code(&a0_color, &a0, prev, PELS_PER_LINE, line, 0);
            break;
        case VR1:
            vertical_code(&a0_color, &a0, prev, PELS_PER_LINE, line, 1);
            break;
        case VL1:
            vertical_code(&a0_color, &a0, prev, PELS_PER_LINE, line, -1);
            break;
        case HOR:
            Ma0a1 = find_huf_len(&a0_color);   // gpos is globally known
            Ma1a2 = find_huf_len(&a0_color);
            memset(line+a0, a0_color, Ma0a1);
            memset(line+a0+Ma0a1, !a0_color, Ma1a2);
            a0 += (Ma0a1 + Ma1a2);
            break;
        case PASS:
            b1 = find_b1(a0_color, a0, prev, PELS_PER_LINE);
            b2 = find_next(a0_color, b1+1, prev, PELS_PER_LINE);
            memset(line+a0, a0_color, b2-a0);
            a0 = b2;
            break;
        case VR2:
            vertical_code(&a0_color, &a0, prev, PELS_PER_LINE, line, 2);
            break;
        case VL2:
            vertical_code(&a0_color, &a0, prev, PELS_PER_LINE, line, -2);
            break;
        case VR3:
            vertical_code(&a0_color, &a0, prev, PELS_PER_LINE, line, 3);
            break;
        case VL3:
            vertical_code(&a0_color, &a0, prev, PELS_PER_LINE, line, -3);
            break;
        }
    }

    return(1);
}

find_b1(int a0_color, int a0, char *line, int length)
{
    int b1;

    if (line[a0] == a0_color)
        b1 = find_next(!a0_color, a0+1, line, length);
    else {
        b1 = find_next(a0_color, a0+1, line, length);
        b1 = find_next(!a0_color, b1+1, line, length);
    }
    return(b1);
}

/*** Fills in a rasterized line according to MR codes ***/
void vertical_code(int *a0_color, int *a0, char *prev, int length, char *curr, int offset)
{
    int a1, b1;

    b1 = find_b1(*a0_color, *a0, prev, length);
    a1 = b1 + offset;
    memset(curr + *a0, *a0_color, a1 - *a0);
    *a0_color = !(*a0_color);
    *a0 = a1;
}

find_huf_len(int *a0_color)
{
    int len;

    len = BW_SYMBOLS - 1 - get_1(*a0_color);

    if (len > 63)
        len = (len - 63) * 64;

    if (len < 64) {
        *a0_color = !*a0_color;
        return(len);
    } else
        return(len + find_huf_len(a0_color));
}
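The run-length routine above treats code values up to 63 as terminating lengths and larger values as makeup codes worth a multiple of 64, adding makeup codes until a terminating code arrives. The sketch below shows that accumulation over an array of already-decoded values. The `run_length` function and its parameters are hypothetical, and the `BW_SYMBOLS - 1 - symbol` inversion and the color toggle of the original are left out.

```c
/* Sum makeup codes until a terminating code (value below 64) is seen.
 * vals holds code values; *used receives the number of values consumed.
 * Returns the run length, or -1 if the input ends on a makeup code. */
int run_length(const int *vals, int nvals, int *used)
{
    int total = 0;
    int i;

    for (i = 0; i < nvals; i++) {
        int len = vals[i];

        if (len > 63)
            len = (len - 63) * 64;   /* makeup code: 64, 128, 192, ... */
        total += len;
        if (len < 64) {              /* terminating code ends the run  */
            *used = i + 1;
            return total;
        }
    }
    *used = nvals;
    return -1;
}
```

For example, the values 64 and 5 decode to a run of 69 pixels, and 66 followed by 0 decodes to 192 pixels. A lone terminating value such as 10 is a complete run on its own.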

find_next(int color, int pos, char *line, int len)
{
    int i;
    char *ptr;

    if (pos > len-1)
        return(len);

    if ( (ptr = memchr(line + pos, color, len-pos)) == NULL)
        return len;
    else
        return (ptr-line);
}

BigErr(int n, char *s)   // too many bits in strip.
{
    printf("Err%d-%s-", n, s);
    exit(9);
}
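Because each pixel occupies one byte, `memchr` is enough to locate the next pixel of a given color, and two such searches locate the changing element that T.4 calls b1 on the reference line. The sketch below illustrates this under the assumption that the two colors are stored as 0 and 1; `next_pixel` and `changing_element_b1` are hypothetical names.

```c
#include <string.h>

/* Index of the first pixel of the given color at or after pos,
 * or len when there is none. */
int next_pixel(int color, int pos, const unsigned char *line, int len)
{
    const unsigned char *p;

    if (pos > len - 1)
        return len;
    p = memchr(line + pos, color, (size_t)(len - pos));
    return p ? (int)(p - line) : len;
}

/* First pixel to the right of a0 on the reference line that has the
 * opposite color to a0_color and follows a pixel of a0_color. */
int changing_element_b1(int a0_color, int a0,
                        const unsigned char *ref, int len)
{
    int b1;

    if (ref[a0] == a0_color)
        return next_pixel(!a0_color, a0 + 1, ref, len);
    b1 = next_pixel(a0_color, a0 + 1, ref, len);
    return next_pixel(!a0_color, b1 + 1, ref, len);
}
```

On the reference line 0 0 1 1 0 0 1 0, starting at position 0 with color 0 gives b1 = 2, and starting at position 0 with color 1 gives b1 = 4. When no further change exists, the result is the line length, which lets the caller fill to the end of the line.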



static int *count, prev, prev1;

/***** Arithmetic decoder stuff *****/

init_get_1()
{
    count = malloc( sizeof(int) * 3);   // mr + b&w
    memset(count, 0, sizeof(int) * 3);
    prev = prev1 = 0;
}

/***** Arithmetic decoder stuff *****/

close_get_1()
{
    free(count);
    close_input_bitstream();
}

/* gets one symbol from the arithmetic coder */

get_1(int mode)
{
    SYMBOL s;
    int c;

    get_symbol_scale( &s, mode, prev, prev1 );
    count[mode] = get_current_count( &s );
    c = convert_symbol_to_int( count[mode], &s );

    if (mode == MR_CONTROL) {
        prev1 = prev;
        prev = c;
    }

    remove_symbol_from_stream( &s );

    if (c != EOF)
        update_model( c );
    return(c);
}



static pack_bytes, byte;

/*** routines for packing bytes to bits (for output) ***/
pack8(unsigned char *line, unsigned char *buf)
{
    int i=0, j, k, color, new_pos, pos=0, n, bits, count=0;

    pack_bytes = 0;
    byte = 0;
    color = line[0];

    while ((new_pos = find_next(!color, pos, line, PELS_PER_LINE)) != PELS_PER_LINE) {
        pack_n_bits(color, new_pos - pos, &count, buf);
        pos = new_pos;
        color = !color;
    }

    pack_n_bits(color, PELS_PER_LINE - pos, &count, buf);
}

pack_n_bits(int color, int n, int *count, char *buf)
{
    int bits;
    static b_table[] = {0,1,3,7,15,31,63,127,255};

    while ( (*count+n) > 8) {
        if (*count != 0) {
            bits = 8 - *count;
            byte = (byte << bits);
            if (color)
                byte += ( (color << bits) - 1 );
            buf[pack_bytes] = byte;
            pack_bytes++;
            n -= bits;
            *count = 0;
        } else {
            if (color)
                byte = 255;
            else
                byte = 0;
            buf[pack_bytes] = byte;
            pack_bytes++;
            n -= 8;
        }
    }

    byte = (byte << n);
    if (color)
        byte += b_table[n];
    (*count) += n;

    if (*count == 8) {
        buf[pack_bytes] = byte;
        pack_bytes++;
        *count = 0;
    }
}
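The packing routines turn runs of one-byte pixels into a bitmap with eight pixels per byte, most significant bit first. The sketch below produces the same layout one bit at a time, which is slower than the byte-at-a-time approach of the listing but easier to check by hand. `pack_run` and its parameters are hypothetical names, and a color value of 1 is assumed to mean a set bit.

```c
#include <stddef.h>

/* Append n pixels of one color (0 or 1) to a packed bitmap.
 * *bitpos counts the bits already written and is advanced by n. */
void pack_run(unsigned char *out, size_t *bitpos, int color, int n)
{
    while (n-- > 0) {
        size_t byte = *bitpos >> 3;
        unsigned shift = 7u - (unsigned)(*bitpos & 7);

        if ((*bitpos & 7) == 0)
            out[byte] = 0;                 /* start each byte clean */
        if (color)
            out[byte] |= (unsigned char)(1u << shift);
        (*bitpos)++;
    }
}
```

Packing runs of 3 white, 7 black (written here as 5 and then 2), and 6 white pixels yields the two bytes 0x1F and 0xC0, and leaves the bit position at 16.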



